continuous embedding and back
Review for NeurIPS paper: From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering
Additional Feedback: Q1.In the end-to-end training section, do the authors learn embeddings by clustering all points together? As in are train, test, and dev points all clustered together or are each of them clustered separately? If all the points are clustered separately then it might not be a reasonable thing in practice because in practice, we do not have access to test data while training, and nor should any test data be used for doing any sort of training. If authors perform some clustering on test points as well, then it might not be reasonable to assume access to *all* test data at test time. Evaluation on test data should preferably be possible even when test data arrives in an online fashion.
Review for NeurIPS paper: From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering
Throughout discussion among reviewers with the author response, all reviewers agree with the novelty and the significance of the theoretical contribution of this paper, which provides approximation guarantees of the proposed embedding. While reviewers raised a concern about empirical performance regarding with computational cost and parameter tuning, they are common problems for other clustering approaches and are not crucial problems of the proposal. Hence I recommend acceptance of this paper.
From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering
Similarity-based Hierarchical Clustering (HC) is a classical unsupervised machine learning algorithm that has traditionally been solved with heuristic algorithms like Average-Linkage. Recently, Dasgupta reframed HC as a discrete optimization problem by introducing a global cost function measuring the quality of a given tree. In this work, we provide the first continuous relaxation of Dasgupta's discrete optimization problem with provable quality guarantees. The key idea of our method, HypHC, is showing a direct correspondence from discrete trees to continuous representations (via the hyperbolic embeddings of their leaf nodes) and back (via a decoding algorithm that maps leaf embeddings to a dendrogram), allowing us to search the space of discrete binary trees with continuous optimization. Building on analogies between trees and hyperbolic space, we derive a continuous analogue for the notion of lowest common ancestor, which leads to a continuous relaxation of Dasgupta's discrete objective.